Multicloud Resource Allocation: Cooperation, Optimization and Sharing

Authors

  • Hao ZHUANG
  • Rameez Rahman
  • Hanjie Pan
  • Cheng Wang
  • Bin Jin
  • Jingjing Wang
  • Runwei Zhang
  • Xinchao Wang
  • Bin Ding
  • Wei Zhuo
  • Shenqi Xie
Abstract

Nowadays our daily life is powered not only by water, electricity, gas and telephony but by the “cloud” as well. Because cloud-based services and applications have penetrated every aspect of our lives and the volume of digital data has grown at an unprecedented rate, cloud computing has become the fifth utility, processing and storing these applications and data. Big cloud vendors such as Amazon, Microsoft and Google have built large-scale centralized data centers to achieve economies of scale, on-demand resource provisioning, high resource availability and elasticity. However, those massive data centers also bring about many other problems, e.g., bandwidth bottlenecks, privacy and security risks, huge energy consumption, and legal and physical vulnerabilities. One possible solution to those problems is to employ multicloud architectures.

In this thesis, we contribute to multicloud resource allocation from three perspectives: cooperation, optimization and data sharing. We address the following problems in the multicloud: how resource providers cooperate, how to reduce information leakage in a multicloud storage system, and how to share big data in a cost-effective way. More specifically, we make the following contributions:

Cooperation in the decentralized cloud. Recently, due to increasing concerns about privacy and data control, many small data centers (SDCs) established by different providers have been emerging in an attempt to meet demand locally. However, SDCs can suffer from resource inelasticity due to their relatively scarce resources, resulting in a loss of performance and revenue. In this work, we propose a decentralized cloud model in which a group of SDCs can cooperate with each other to improve performance. Moreover, we design a general strategy function for SDCs to evaluate the performance of cooperation based on different dimensions of resource sharing. Through extensive simulations using a realistic data center model, we show that strategies based on reciprocity are more effective than other strategies, e.g., those using prediction based on historical data. Our results show that the reciprocity-based strategy can thrive in a heterogeneous environment with competing strategies.

Multicloud optimization on information leakage. Many schemes have recently been proposed for storing data on multiple clouds. Distributing data over different cloud storage providers (CSPs) automatically gives users a certain degree of information leakage control, since no single point of attack can leak all the information. However, unplanned distribution of data chunks can lead to high information disclosure even when multiple clouds are used. In this work, we first study an important information leakage problem caused by unplanned data distribution in multicloud storage services. Then, we present StoreSim, an information-leakage-aware storage system for the multicloud. StoreSim aims to store syntactically similar data on the same cloud, thereby minimizing the user’s information leakage across multiple clouds. We design an approximate algorithm that efficiently generates similarity-preserving signatures for data chunks based on MinHash and Bloom filters, and also design a function to compute the information leakage based on these signatures. Next, we present an effective storage plan generation algorithm based on clustering for distributing data chunks with minimal information leakage across multiple clouds. Finally, we evaluate our scheme using two real datasets from Wikipedia and GitHub. We show that our scheme can reduce the information leakage by up to 60% compared to unplanned placement. Furthermore, our analysis in terms of system attackability demonstrates that our scheme makes attacks on information much more complex.

Smart data sharing. Moving large amounts of distributed data into the cloud or from one cloud to another can incur high costs in both time and bandwidth. Data sharing in the multicloud can be optimized from two different angles:

  • Inter-cloud scheduling. Existing centralized solutions for data sharing, such as Dropbox, replicate data to all interested parties, which is prohibitively costly given the large size of datasets. A more practical solution is to use a Peer-to-Peer (P2P) approach to replicate data in a self-organized manner. However, existing P2P approaches focus on minimizing download time without taking the bandwidth cost into account. In this work, we present CoShare, a P2P-inspired, decentralized, cost-effective sharing system for data replication. CoShare allows users to specify their requirements on data sharing tasks and maps these requirements into resource requirements for data transfer. Through extensive simulations, we demonstrate that CoShare finds desirable tradeoffs between cost and performance under varying user requirements and request arrival rates.
  • Making big data smaller. The sheer size of big data imposes great challenges on storing, sharing and processing such data in the multicloud. These challenges can be addressed by data summarization, which transforms the original dataset into a smaller, yet still useful, subset. In this work, we take Twitter data and its applications based on topic models as a case study. We aim to reduce the size of the Twitter dataset while preserving the topics in the original big dataset. Existing work finds such small subsets with objective functions based on data properties such as representativeness or informativeness, but does not exploit social contexts, which are distinctive characteristics of social data. By analyzing Twitter data, we discover two social contexts that are important for topic generation and dissemination, namely (i) the CrowdExp topic score, which captures the influence of both the crowd and the expert users in Twitter, and (ii) the Retweet topic score, which captures the influence of Twitter users’ actions. We conduct extensive experiments on two real-world Twitter datasets using two applications. The experimental results show that, by leveraging social contexts, our proposed solution can reduce the total size of the data without degrading performance on topic-related applications.
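
As an illustration of the similarity-preserving signatures mentioned for StoreSim above, the following is a minimal MinHash sketch in Python. It is not StoreSim's actual algorithm, which the abstract says also involves Bloom filters and a leakage function that are not specified here. The function names, the 4-character shingle size and the 128 hash seeds are all assumptions chosen for illustration.

```python
import hashlib


def shingles(text, k=4):
    """Return the set of overlapping k-character substrings of text."""
    if len(text) <= k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _hash(item, seed):
    """Deterministic 64-bit hash of item under a given seed (assumed scheme)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """For each seed, keep the minimum hash value over all items."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching positions estimates the Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


if __name__ == "__main__":
    # Hypothetical data chunks: two near-duplicates and one unrelated chunk.
    chunk_a = "the quick brown fox jumps over the lazy dog"
    chunk_b = "the quick brown fox jumps over the lazy cat"
    chunk_c = "multicloud storage spreads chunks across providers"
    sig_a, sig_b, sig_c = (minhash_signature(shingles(c))
                           for c in (chunk_a, chunk_b, chunk_c))
    print("a vs b:", estimated_jaccard(sig_a, sig_b))
    print("a vs c:", estimated_jaccard(sig_a, sig_c))
```

Under these assumptions, near-duplicate chunks receive signatures that agree in most positions, so a placement policy in the spirit of the abstract could group such chunks on the same cloud using only the compact signatures rather than the full data.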

Similar resources

Integrated modeling and solving the resource allocation problem and task scheduling in the cloud computing environment

Cloud computing is considered to be a new service provider technology for users and businesses. However, the cloud environment is facing a number of challenges. Resource allocation in a way that is optimum for users and cloud providers is difficult because of lack of data sharing between them. On the other hand, job scheduling is a basic issue and at the same time a big challenge in reaching hi...

Cognitive Radio Network Modeling Using Game Theoretic Approach for Effective Resource Allocation

Spectrum Sensing, Resource Sharing and Enhanced Multihop Modelling scenarios have seen much change in the last decade, leading to better probability of choosing near perfection assumptions and working models for Cognitive Radio Networks. In continuation to the contributions of the world for the improvements of intelligent radio networks, this project is aimed at developing an optimal solution f...

An Intelligent Algorithm for Optimization of Resource Allocation Problem by Considering Human Error in an Emergency Department

Human error is a significant and ever-growing problem in the healthcare sector. In this study, resource allocation problem is considered along with human errors to optimize utilization of resources in an emergency department. The algorithm is composed of simulation, artificial neural network (ANN), design of experiment (DOE) and fuzzy data envelopment analysis (FDEA). It is a multi-response opt...

Grid Computing based on Game Optimization Theory for Networks Scheduling

A resource sharing mechanism is introduced into the grid computing algorithm to solve complex computational tasks in heterogeneous network computing. However, in the Grid environment, the available network resources must be scheduled and coordinated reasonably in order to obtain a good workflow and appropriate network performance and response time. In order to...

Modified Aco Algorithm for Resource Allocation in Cloud Computing Environment

In cloud computing, the private cloud is a model in which resource sharing is becoming increasingly popular. The importance of resource sharing has led to the development of many algorithms. Existing work in the field of resource sharing has used many optimization techniques. The two techniques namely Service Composition Optimal Selection (SCOS) and Optimal Allocation of Computing Resources (OACR...

Publication date: 2017